We present exact algorithms for identifying deterministic-actions effects andpreconditions in dynamic partially observable domains. They apply when one doesnot know the action model(the way actions affect the world) of a domain andmust learn it from partial observations over time. Such scenarios are common inreal world applications. They are challenging for AI tasks because traditionaldomain structures that underly tractability (e.g., conditional independence)fail there (e.g., world features become correlated). Our work departs fromtraditional assumptions about partial observations and action models. Inparticular, it focuses on problems in which actions are deterministic of simplelogical structure and observation models have all features observed with somefrequency. We yield tractable algorithms for the modified problem for suchdomains. Our algorithms take sequences of partial observations over time as input, andoutput deterministic action models that could have lead to those observations.The algorithms output all or one of those models (depending on our choice), andare exact in that no model is misclassified given the observations. Ouralgorithms take polynomial time in the number of time steps and state featuresfor some traditional action classes examined in the AI-planning literature,e.g., STRIPS actions. In contrast, traditional approaches for HMMs andReinforcement Learning are inexact and exponentially intractable for suchdomains. Our experiments verify the theoretical tractability guarantees, andshow that we identify action models exactly. Several applications in planning,autonomous exploration, and adventure-game playing already use these results.They are also promising for probabilistic settings, partially observablereinforcement learning, and diagnosis.
展开▼